
Scholar Commons - Institutional Repository of the University of South Carolina

Sequence-based identification of interface residues by an integrative profile combining hydrophobic and evolutionary information

Author: A Porollo
AJ Bordner
B Wang
BD Alberts
C Cortes
C Sander
ED Levy
F Glaser
F Pazos
H Chen
H Zhou
HM Berman
HS Wong
I Ezkurdia
I Res
J Chung
J Janin
J Kittler
J Kyte
J Mihel
JC Bezdek
Jinyan Li
JR Bradford
KS Thorn
L Lo Conte
LI Kuncheva
LK Hansen
M Charton
M Guharoy
M Sikic
N H
P Baldi
P Chakrabarti
P Chen
P Cherepanov
P Fariselli
Peng Chen
Q Dong
R Singh
RA Laskowski
RD Pascual-Marqui
RM Kini
RP Bahadur
S Jones
SJ de Vries
T Friedrich
T Kohonen
TA Larsen
TJ Bollenbach
Uni-Prot-Consortium
V Chelliah
W Kauzmann
X Du
X Gallet
XW Chen
Y Murakami
Y Ofran
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract
Background: Protein-protein interactions play essential roles in protein function determination and drug design. Numerous methods have been proposed to recognize interaction sites; however, only a small proportion of protein complexes have been successfully resolved because of the high cost. It is therefore important to improve the prediction of protein interaction sites from primary sequence alone.
Results: We propose constructing an integrative profile for each residue in a protein by combining its hydrophobic and evolutionary information. A support vector machine (SVM) ensemble is then developed in which the SVMs are trained on different pairs of positive (interface site) and negative (non-interface site) subsets. The subsets, which have roughly equal sizes, are grouped in order of the change in accessible surface area before and after complexation. A self-organizing map (SOM) technique is applied to group similar input vectors so that interface residues are identified more accurately. An ensemble of ten SVMs improves MCC by around 8% and F1 by around 9% over an ensemble of three SVMs. As expected, SVM ensembles consistently perform better than individual SVMs. In addition, the model built on integrative profiles outperforms models based on the sequence profile or the hydropathy scale alone. Because our method uses a small number of features to encode the input vectors, our model is simpler, faster and more accurate than existing methods.
Conclusions: The integrative profile combining hydrophobic and evolutionary information contributes most to protein-protein interaction prediction. The results show that the evolutionary context of a residue with respect to hydrophobicity improves the identification of protein interface residues. In addition, the ensemble of SVM classifiers improves prediction performance.
Availability: Datasets and software are available at <url>http://mail.ustc.edu.cn/~bigeagle/BMCBioinfo2010/index.htm</url>.
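The abstract above reports results as MCC and F1 without giving formulas. As a reading aid, the following minimal sketch computes both from per-residue binary labels using the standard textbook definitions. The function names and the toy labels are illustrative assumptions and are not taken from the paper or its software.

```python
import math
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count TP, FP, TN, FN for binary labels (1 = interface, 0 = non-interface)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient; defined as 0.0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example (hypothetical labels, not data from the paper).
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(f1_score(tp, fp, fn), mcc(tp, fp, tn, fn))
```

MCC uses all four cells of the confusion matrix, which makes it a common choice when interface residues are much rarer than non-interface residues.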

Springer - Publisher Connector

OPUS - University of Technology Sydney

HemeBIND: a novel method for heme binding residue prediction by combining structural and sequence information

Author: A Armon
A Pintar
A Smith
A Yamaguchi
AJ Bordner
AT Laurie
B Huang
B Rost
C Fufezan
CJ Reedy
DG Levitt
DT Jones
F Glaser
FP Guengerich
GJ Bartlett
HB Gray
HR Ansari
HX Zhou
IB Kuznetsov
J Liang
J Mihel
JA Capra
JC Nebel
Jianjun Hu
JS Chauhan
JS Sodhi
LJ Smith
M Brylinski
M Hendlich
M Paoli
M Weisel
N Igarashi
NB Terwilliger
NK Mishra
O Schueler-Furman
RA Laskowski
Rong Liu
RR Thangudu
S Henrich
S Jones
S Schneider
SF Altschul
SM Mense
T Guo
T Pupko
V Sobolev
VN Vapnik
W De Laurentis
W Kabsch
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract
Background: Accurate prediction of the binding residues involved in interactions between proteins and small ligands is one of the major challenges in structural bioinformatics. Heme is an essential and commonly used ligand that plays critical roles in electron transfer, catalysis, signal transduction and gene expression. Although much effort has been devoted over the last decade to developing generic algorithms for ligand binding site prediction, no algorithm has been specifically designed to complement experimental techniques for identifying heme binding residues. Consequently, there is an urgent need for a computational method that recognizes these important residues.
Results: Here we introduce HemeBIND, an efficient algorithm for predicting heme binding residues by integrating structural and sequence information. We systematically investigated the characteristics of binding interfaces based on a non-redundant dataset of heme-protein complexes. We found that several sequence and structural attributes, such as evolutionary conservation, solvent accessibility, depth and protrusion, clearly distinguish heme binding residues from non-binding residues. These features can be used separately or in combination to build structure-based classifiers using a support vector machine (SVM). The results showed that the information contained in these features is largely complementary and that their combination achieved the best performance. To further improve performance, we developed a post-processing procedure to reduce the number of false positives. In addition, we built a sequence-based classifier based on SVM and the sequence profile as an alternative when only sequence information is available. Finally, we employed a voting method to combine the outputs of the structure-based and sequence-based classifiers, which demonstrated remarkably better performance than either classifier alone.
Conclusions: HemeBIND is the first specialized algorithm for predicting heme binding residues in protein structures. Extensive experiments indicated that both the structure-based and sequence-based methods effectively identify heme binding residues, and that the complementary relationship between them can result in a significant improvement in prediction performance. The value of our method is highlighted by the HemeBIND web server, which is freely accessible at <url>http://mleg.cse.sc.edu/hemeBIND/</url>.
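The abstract above says a voting method combines the structure-based and sequence-based classifier outputs but does not describe the scheme. The sketch below shows one possible form, a weighted vote over per-residue scores, purely as an illustration. The function name, the `weight` and `threshold` parameters, and the example scores are all assumptions and do not describe HemeBIND's actual procedure.

```python
from typing import List, Sequence


def combine_by_weighted_vote(
    struct_scores: Sequence[float],
    seq_scores: Sequence[float],
    weight: float = 0.5,
    threshold: float = 0.5,
) -> List[int]:
    """Combine two per-residue score lists into binary labels (1 = binding).

    Assumption: both classifiers emit scores in [0, 1] for the same residues,
    in the same order. `weight` is the share given to the structure-based score.
    """
    if len(struct_scores) != len(seq_scores):
        raise ValueError("both classifiers must score the same residues")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    labels = []
    for s_struct, s_seq in zip(struct_scores, seq_scores):
        combined = weight * s_struct + (1.0 - weight) * s_seq
        labels.append(1 if combined >= threshold else 0)
    return labels


# Hypothetical scores for three residues (not data from the paper).
print(combine_by_weighted_vote([0.9, 0.2, 0.6], [0.7, 0.1, 0.3]))
```

With only two voters a plain majority vote cannot break ties, which is why this sketch averages scores instead; other schemes (for example, requiring both classifiers to agree) would be equally consistent with the abstract.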

Springer - Publisher Connector

Scholar Commons - Institutional Repository of the University of South Carolina

Incorporation of protein binding effects into likelihood ratio test for exome sequencing data

Author: A Ceol
AG Vandell
B Li
BM Neale
C Dering
DI Chasman
Dmitry Korkin
Dongni Zhang
F Pedregosa
FA San Lucas
GT Wang
Hongzhu Cui
J Mihel
K Wang
K-S Lynn
KA Ross
L Gordon
L Yang
L Zhou
M Sikić
N Zhao
P Gassó
R Bergholdt
S Liang
S Morgenthaler
S Sharif
SA Founds
TA Manolio
W Kabsch
Wellcome Trust Case Control Consortium
Y Murakami
Y Yamada
Y-C Chen
YI Ingster
Zheyang Wu
Publication venue: Springer Science and Business Media LLC

Comparative Assessment of Data Sets of Protein Interaction Hot Spots Used in the Computational Method

Author: B.C. Cunningham
J. Mihel
J.-F. Xia
K.-I. Cho
K.S. Thorn
L. Wang
L. Ye
N. Tuncbag
Q. Liu
Q. Nguyen
S.J. Darnell
T. Fischer
T. Kortemme
X. Zhu
Z.-P. Liu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

A resource for benchmarking the usefulness of protein structure models

Author: A Bairoch
A Fiser
A Giorgetti
A Kryshtafovych
A Sali
A Schreyer
A Zemla
AK Arakaki
Anna Tramontano
C Chothia
CT Porter
D Raimondo
Daniel Carbajo
FC Bernstein
G Wang
GJ Bartlett
ID Kuntz
J Mihel
J Moult
J Soding
M Brylinski
MA Marti-Renom
N Eswar
O Goldenberg
OM Becker
R Linding
RA Laskowski
S Ekins
S Miller
W Kabsch
Publication venue: BMC
Publication date: 01/01/2012

Abstract
Background: Increasingly, biologists and biochemists use computational tools to design experiments that probe the function of proteins and/or to engineer them for a variety of purposes. The most effective strategies rely on knowledge of the three-dimensional structure of the protein of interest. However, an experimental structure is often unavailable, and models of varying quality are used instead. The relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific applications.
Results: This paper describes a database and related software tools that allow a given structure-based method to be tested on models of a protein representing different levels of accuracy. Comparing the results of a computational experiment on the experimental structure with those on a set of its decoy models allows developers and users to assess the threshold of accuracy required to perform the task effectively.
Conclusions: The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database.
Implementation, availability and requirements: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: <url>http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php</url>. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); mySQL, Perl DBI and DBD modules (database); php, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by non-academics: No.

Springer - Publisher Connector